Goto

Collaborating Authors

 interest profile


Real-Time Summarization of Twitter

arXiv.org Artificial Intelligence

In this paper, we describe our approaches to TREC Real-Time Summarization of Twitter. We focus on real time push notification scenario, which requires a system monitors the stream of sampled tweets and returns the tweets relevant and novel to given interest profiles. Dirichlet score with and with very little smoothing (baseline) are employed to classify whether a tweet is relevant to a given interest profile. Using metrics including Mean Average Precision (MAP, cumulative gain (CG) and discount cumulative gain (DCG), the experiment indicates that our approach has a good performance. It is also desired to remove the redundant tweets from the pushing queue. Due to the precision limit, we only describe the algorithm in this paper.


JPLink: On Linking Jobs to Vocational Interest Types

arXiv.org Machine Learning

Linking job seekers with relevant jobs requires matching based on not only skills, but also personality types. Although the Holland Code also known as RIASEC has frequently been used to group people by their suitability for six different categories of occupations, the RIASEC category labels of individual jobs are often not found in job posts. This is attributed to significant manual efforts required for assigning job posts with RIASEC labels. To cope with assigning massive number of jobs with RIASEC labels, we propose JPLink, a machine learning approach using the text content in job titles and job descriptions. JPLink exploits domain knowledge available in an occupation-specific knowledge base known as O*NET to improve feature representation of job posts. To incorporate relative ranking of RIASEC labels of each job, JPLink proposes a listwise loss function inspired by learning to rank. Both our quantitative and qualitative evaluations show that JPLink outperforms conventional baselines. We conduct an error analysis on JPLink's predictions to show that it can uncover label errors in existing job posts.


Detecting Review Spammer Groups

AAAI Conferences

With an increasing number of paid writers posting fake reviews to promote or demote some target entities through Internet, review spammer detection has become a crucial and challenging task. In this paper, we propose a three-phase method to address the problem of identifying review spammer groups and individual spammers, who get paid for posting fake comments. We evaluate the effectiveness and performance of the approach on a real-life online shopping review dataset from amazon.com. The experimental result shows that our model achieved comparable or better performance than previous work on spammer detection.


College Towns, Vacation Spots, and Tech Hubs: Using Geo-Social Media to Model and Compare Locations

AAAI Conferences

In this paper, we explore the potential of geo-social media to construct location-based interest profiles to uncover the hidden relationships among disparate locations. Through an investigation of millions of geo-tagged Tweets, we construct a per-city interest model based on fourteen high-level categories (e.g., technology, art, sports). These interest models support the discovery of related locations that are connected based on these categorical perspectives (e.g., college towns or vacation spots) but perhaps not on the individual tweet level. We then connect these city-based interest models to underlying demographic data. By building multivariate multiple linear regression (MMLR) and neural network (NN) models we show how a location's interest profile may be estimated based purely on its demographics features.